Identifying Basic Patterns of Korean Natural Language

Authors

Jinseok Chae

Sukho Lee

Abstract

Korean natural language queries are composed of a number of basic building blocks. This paper describes a process for identifying the basic patterns that serve as the building blocks of Korean queries. Two sets of Korean queries, generated by two groups of senior-level students, were used in the experiment. Questions in the first set were produced by students who had no knowledge of databases or schemas. Students in the second group attended a short lecture and understood that the Korean queries would be executed by a computer. By analyzing these experimental queries, seven basic patterns are identified. Korean queries formed by combining these basic patterns cover more than 80% of all questions.

Similar resources

The Machine Translation Researches and Governmental View in Korea

Viewed from a broad perspective, in the seventies when we studied the basic technologies of NLP as a groundwork of MT, the focus of research was given to describing various phenomena of the Korean language in a linguistically significant way and processing the Korean characters mathematically or specific phenomena of the language logically with a computer. The theoretical linguistic description...

Segmentation Granularity in Dependency Representations for Korean

Previous work on Korean language processing has proposed different basic segmentation units. This paper explores different possible dependency representations for Korean using different levels of segmentation granularity — that is, different schemes for morphological segmentation of tokens into syntactic words. We provide a new Universal Dependencies (UD)-like corpus based on different levels o...

Text Mining: Extraction of Interesting Association Rule with Frequent Itemsets Mining for Korean Language from Unstructured Data

Text mining is a specific method to extract knowledge from structured and unstructured data. The knowledge extracted by the text mining process can be put to further use and discovery. This paper presents a method for extracting information from unstructured text data and the importance of Association Rules Mining, specifically for the Korean language (text) and also NLP (Natural Language ...

A Constrained Finite-State Morphotactics for Korean

In this paper, we propose a constrained finite-state model, named cfsm, for Korean morphotactics and attempt to show how it can successfully treat some major morphological problems in Korean. As a preliminary descriptive framework, this model adopts the Korean morphological system Komor by Lee (1999) to lay out some basic problems in Korean morphotactics and describe linear approaches ...

A Comparison of Two Variant Corpora: The Same Content with Different Source

In order to investigate the effect of source language on translations, we investigate two variants of a Korean translation corpus. The first variant consists of Korean translations of 162,308 Japanese sentences from the ATR BTEC (Basic Expression Text Corpus). The second variant was made by translating the English translations of the Japanese sentences into Korean. We show that the sou...

Publication date: 2007
